Skip to content

Conversation

@overlookmotel
Copy link
Contributor

@overlookmotel overlookmotel commented Nov 30, 2025

#144472 made str::floor_char_boundary a const function, but in doing so introduced a loop. This is unnecessary because the next UTF-8 character boundary can only be within a 4-byte range.

This produces excessive code for e.g. str.floor_char_boundary(20), because the loop is unrolled.

https://godbolt.org/z/5f3YsM6oK

This PR replaces the loop with 3 checks in series.

In addition to reducing code size in some cases, it also removes bounds checks from calling code when following floor_char_boundary with a call to String::truncate (which I assume might be a common pattern).

Notes

  • I tried using index.unchecked_sub(1), but found it doesn't remove bounds checks, whereas index.checked_sub(1).unwrap_unchecked() does. Surprising!

  • The assert_uncheckeds are required to elide checks from code following the floor_char_boundary call e.g.:

let index = string.floor_char_boundary(20);
string.truncate(index);

If this PR is accepted, I'll follow up with a similar PR for ceil_char_boundary.

Very happy to receive feedback and amend.

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Nov 30, 2025
@rustbot
Copy link
Collaborator

rustbot commented Nov 30, 2025

r? @scottmcm

rustbot has assigned @scottmcm.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@clarfonthey
Copy link
Contributor

clarfonthey commented Nov 30, 2025

Rather than manually unrolling the code here, it might be worth considering if the original implementation can be made const instead:

let lower_bound = index.saturating_sub(3);
let new_index = self.as_bytes()[lower_bound..=index]
.iter()
.rposition(|b| b.is_utf8_char_boundary());

I had this exact thought in the original implementation, which is why I explicitly only checked the four following bytes for a character boundary. Note that I haven't fully scrutinised the code and yours may be better regardless, but I figured I'd mention this for some extra context. I would expect LLVM to automatically unroll a 4-max-iteration loop in most contexts.

Comment on lines +416 to +439
let bytes = self.as_bytes();
let boundary_index = if bytes[index].is_utf8_char_boundary() {
index
} else {
let mut i = index;
while i > 0 {
if self.as_bytes()[i].is_utf8_char_boundary() {
break;
// SAFETY: `bytes[index]` is a UTF-8 continuation byte, therefore there must be a byte before it.
// Note: `index.unchecked_sub(1)` would be preferable, but it doesn't the remove bounds check
// from `bytes[previous_index]`.
let previous_index = unsafe { index.checked_sub(1).unwrap_unchecked() };
if bytes[previous_index].is_utf8_char_boundary() {
previous_index
} else {
// SAFETY: `bytes[index - 1]` is a UTF-8 continuation byte, therefore there must be a byte before it
let previous_previous_index = unsafe { index.checked_sub(2).unwrap_unchecked() };
if bytes[previous_previous_index].is_utf8_char_boundary() {
previous_previous_index
} else {
// UTF-8 character sequences are at most 4 bytes long.
// `bytes[index - 2]`, `bytes[index - 1]`, and `bytes[index]` are all continuation bytes,
// therefore `bytes[index - 3]` must be the 1st byte in a 4-byte UTF-8 character sequence.
debug_assert!(bytes[index - 3].is_utf8_char_boundary());
index - 3
}
i -= 1;
}
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The deep nesting isn't ideal. You could try matching slice patterns:

let bytes = &self.as_bytes()[..=index];
let boundary_index = match bytes {
    [.., b] if b.is_utf8_char_boundary() => index,
    [.., b, _] if b.is_utf8_char_boundary() => index - 1,
    [.., b, _, _] if b.is_utf8_char_boundary() => index - 2,
    [.., b, _, _, _] if b.is_utf8_char_boundary() => index - 3,
    // SAFETY: TODO
    _ => unsafe {
        debug_assert!(bytes[index.saturating_sub(3)].is_utf8_char_boundary());
        unreachable_unchecked()
    }
};

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is much nicer. I'll check if it optimizes equally as well.

@okaneco
Copy link
Contributor

okaneco commented Nov 30, 2025

You might want to add a codegen test under tests/codegen-llvm to make sure the panics/slice fail checks that you're targeting are optimized out.

At the time, I tried to add a counter variable to limit the number of iterations but I wasn't able to remove the panics like the current implementation does.

slice_new and truncate_new still have panic paths when the index is variable. Variable index slice doesn't have a panic in the current implementation. Variable truncate has a panic for both implementations.
https://godbolt.org/z/T1TexYbE7

I wrote an example codegen test to help with testing this out (others may be able to improve it further).
You can run the tests with ./x test [path to file or directory] (save time by iterating over the IR/assembly in godbolt first).
Again, the variable truncate may not be possible to enable currently.
https://rustc-dev-guide.rust-lang.org/tests/compiletest.html#codegen-tests

Example codegen test file

// Test that panics and slice errors are elided from `floor_char_boundary` implementation.

//@ compile-flags: -Copt-level=3

#![crate_type = "lib"]

// CHECK-LABEL: @floor_fixed_index
#[no_mangle]
pub fn floor_fixed_index(s: &str) -> usize {
    // CHECK-NOT: panic
    s.floor_char_boundary(20)
}

// CHECK-LABEL: @floor_variable_index
#[no_mangle]
pub fn floor_variable_index(s: &str, i: usize) -> usize {
    // CHECK-NOT: panic
    s.floor_char_boundary(i)
}

// CHECK-LABEL: @floor_slice_fixed_index
#[no_mangle]
pub fn floor_slice_fixed_index(s: &str) -> &str {
    // CHECK-NOT: panic
    // CHECK-NOT: slice_error_fail
    let index = s.floor_char_boundary(20);
    &s[..index]
}

// CHECK-LABEL: @floor_slice_variable_index
#[no_mangle]
pub fn floor_slice_variable_index(s: &str, i: usize) -> &str {
    // CHECK-NOT: panic
    // CHECK-NOT: slice_error_fail
    let index = s.floor_char_boundary(i);
    &s[..index]
}

// CHECK-LABEL: @truncate_from_floor_fixed_index
#[no_mangle]
pub fn truncate_from_floor_fixed_index(s: &mut String) {
    // CHECK-NOT: panic
    let index = s.floor_char_boundary_new(20);
    s.truncate(index)
}

// CHECK-LABEL: @truncate_from_floor_variable_index
#[no_mangle]
pub fn truncate_from_floor_variable_index(s: &mut String, i: usize) {
    // CHECK-NOT: panic
    let index = s.floor_char_boundary_new(i);
    s.truncate(index)
}

@overlookmotel
Copy link
Contributor Author

slice_new and truncate_new still have panic paths when the index is variable. Variable index slice doesn't have a panic in the current implementation. Variable truncate has a panic for both implementations.

Yes, I noticed this today. Am hoping to eliminate all these panics. I feel like it should be possible, but... mysteries of the compiler...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants